Measuring the primordial power spectrum: Principal
component analysis of the cosmic microwave background

ABSTRACT

We implement and investigate a method for measuring departures from
scale-invariance, both scale-dependent and scale-free, in the primordial power
spectrum of density perturbations using cosmic microwave background (CMB)
$C_\ell$ data and a principal component analysis (PCA) technique. The primordial
power spectrum is decomposed into a dominant scale-invariant Gaussian adiabatic
component plus a series of orthonormal modes whose detailed form depends only on
the noise model for a particular CMB experiment. In general, however, these modes
are localised across wavenumbers with $0.01 < k < 0.2\,{\rm Mpc}^{-1}$, displaying
rapid oscillations on scales corresponding to the acoustic peaks, where the
sensitivity to the primordial power spectrum is greatest. The performance of this
method is assessed using simulated data for the Planck satellite, and the full
cosmological plus power spectrum parameter space is integrated out using Markov
Chain Monte Carlo. As a proof of concept we apply this data compression technique
to the current CMB data from WMAP, ACBAR, CBI, VSA and Boomerang. We find no
evidence for the breaking of scale-invariance from measurements of four PCA mode
amplitudes, which is translated to a constraint on the scalar spectral index
$n_S(k = 0.04\,{\rm Mpc}^{-1}) = 0.94 \pm 0.04$, in accordance with WMAP studies.

Key words: cosmology: observations - cosmic microwave background - large-scale 
structure of the universe - methods: data analysis 



1 INTRODUCTION 

Observations of the cosmic microwave background (CMB) 
anisotropies are presenting a fascinating opportunity for dis- 
cerning between our models for the origin of structure in 
the universe in great detail. Indeed the most recent observations
of the CMB from the Wilkinson Microwave Anisotropy
Probe (WMAP) have vindicated a basic picture for the
primordial perturbations which are nearly scale-invariant,
Gaussian and adiabatic in nature, and which are dominated
by a passive, growing mode. This represents enormous
progress by instrumentalists in the thirty years since
Zel'dovich and Novikov lamented in their 1975 monograph
over the observational prospects for measuring the CMB
anisotropies: 'Given all the difficulties, it is not clear that 
we will ever successfully investigate the nature of the initial 
perturbations using the concept of [Sakharov] modulation [of 
the acoustic peaks]' (Zel'dovich & Novikov 1975). 

At this time, therefore, there is an overall consistency between observations
(Peiris et al 2003; Barger, Lee & Marfatia 2003; Leach & Liddle 2003) and the
inflationary paradigm, which is well known to contain a mechanism for generating
large-scale perturbations of this type (see Liddle & Lyth 2000; Dodelson 2003).
In the near future, though, most progress in our understanding of the origin of
structure is likely to come from empirical studies of the primordial
perturbations where one of the known ingredients of the standard Gaussian
adiabatic model is relaxed to a more general form. Indeed, this has been the
spirit in which many authors have proceeded. In particular there has been a
strong interest in measuring the shape of the primordial power spectrum, given
the prospect of a factor of twenty or so increase in the data available to this
sector of cosmology in the near future, coming from ground-based, balloon-borne
and satellite experiments.

Model-independent methods for reconstructing the primordial power spectrum are
being investigated, where one relies only on the broad assumption that the
overall picture of Gaussian adiabatic perturbations is correct. The available
data are then confronted with a more general primordial power spectrum sector,
and the full parameter space is integrated out in a medium-sized computation.
Many such power spectrum parametrisations exist and these include bandpowers
(Wang, Spergel & Strauss 1999; Bridle et al 2003; Hannestad 2004), band-colours
(Bridle et al 2003), wavelet bandpowers (Mukherjee & Wang 2003a, c), and
orthogonal wavelets (Mukherjee & Wang 2003b). The specific choices to be made,
such as the number and the location of the bandpowers, will require a certain
amount of optimisation. However, these promising methods are known to perform
well on both real and simulated data without degrading too far the expected
constraints on the remaining cosmological parameters (Bond et al 2004;
Mukherjee & Wang 2005).

One can also apply inverse methods in order to reconstruct the primordial power
spectrum, since the problem at hand is akin to deconvolution. Many methods have
been investigated and these include semi-analytic iterative methods (Kogo et al
2005), the Richardson-Lucy deconvolution algorithm (Shafieloo & Souradeep 2004),
and regularised least-squares (Tegmark & Zaldarriaga 2002; Tocchini-Valentini,
Douspis & Silk 2005). While these strategies may provide a reasonable glimpse of
the form of the primordial power spectrum at a lower computational cost, they
typically suffer from the weakness that the cosmological parameters must be fixed
to some representative values and are not integrated out. In addition there is
usually a smoothing step involved, either in the data or in the deconvolved
power spectrum, which requires careful treatment.

There is a data compression strategy which, although it is most similar in
spirit to the model-independent methods described above, corresponds to asking a
slightly different question than whether we can reconstruct or deconvolve the
primordial power spectrum. Although the question we refer to has been in the air
and in the minds of many people for years, and is partially addressed by any CMB
analysis that constrains the power-law slope of the primordial power spectrum,
it is worth stating it here explicitly: Are scale-invariant adiabatic
perturbations an ingredient of our cosmology, and how can we best measure any
departures from scale-invariance? This question is important because its
eventual answer will represent the next step in our attempts to model and
understand the underlying mechanism responsible for generating the primordial
perturbations. We will demonstrate in this paper that principal component
analysis is very well suited for this purpose. Briefly summarised, the trick is
to choose a complete orthonormal power spectrum basis which also reflects our
expectation of where the departures from scale-invariance are likely to be best
probed by the data, as has been repeatedly emphasised by Hu and collaborators
(Hu & Okamoto 2004; Kadota et al 2005). The full cosmological plus power
spectrum parameter space can be integrated out in a medium to large scale
computation, and theoretical predictions for the power spectrum can be easily
projected onto the same power spectrum basis to make the comparison with
observations.

The outline of this paper is as follows: in §2 we describe the principal
component analysis formalism, providing a commentary on the relevant
implementation details; in §3 we test the method with simulated Planck data
using three primordial power spectra which are respectively scale-invariant,
scale-free, and broken scale-invariant; in §4 we apply the method to the WMAP
data, before concluding in §5.



2 PRINCIPAL COMPONENT ANALYSIS 
FORMALISM 

In this paper we implement and investigate the principal 
component analysis (hereafter PCA) method detailed and 
described by Hu and Okamoto (2004) (hereafter HO04) 
which should be considered a companion paper. PCA has
also been applied or discussed in many other contexts
in which data volumes have already seen, or will soon see,
sharp increases, for instance in galaxy-galaxy power spectrum
estimation methods (Hamilton & Tegmark 2000),
reionization history reconstruction (Hu & Holder 2003),
dark energy reconstruction (Huterer & Starkman 2003) and
most recently in the context of reconstructing the inflation
potential (Kadota et al 2005). It can be thought of simply
as a change of parameter basis, where the rotation is deter- 
mined by properties of the observed or expected signal and 
noise. At the same time it is also a very useful lossless data 
compression technique. 

The basic set-up in the context of the CMB is not at all unfamiliar to
astrophysics, that of a deconvolution problem

where $X = T, E$ and the dependence of the CMB transfer functions $T_\ell^X(k)$
on the cosmological parameters $\{\omega_i\}$ has been written explicitly in
order to show the added complication over and above an ordinary deconvolution
problem of this type. Interestingly, there is a satisfactory solution to the
problem of extracting the primordial power spectrum $\mathcal{P}(k)$, described
in HO04, which involves exploiting what we know about the expected noise on
$C_\ell$ and our precise and accurate knowledge of the CMB transfer function
physics (Seljak et al 2003). Here we present the relevant equations from HO04.
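For orientation, the deconvolution problem referred to above is usually written in the following form. This is a sketch in HO04-style notation rather than a transcription of equation (1): the $\ell(\ell+1)/2\pi$ normalisation and the explicit $\omega_i$ arguments are assumptions made here for illustration.

```latex
\begin{equation*}
\frac{\ell(\ell+1)}{2\pi}\, C_\ell^{XX'}
  = \int d\ln k \; T_\ell^{X}(k;\omega_i)\, T_\ell^{X'}(k;\omega_i)\, \mathcal{P}(k),
\qquad X, X' \in \{T, E\}.
\end{equation*}
```

Written this way, the transfer functions play the role of the convolution kernel, and the extra difficulty is that the kernel itself depends on the cosmological parameters.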

The response of the $C_\ell$ with respect to some primordial
power spectrum parameters $\{p_i\}$ can be investigated via a
mode counting approach by constructing the Fisher infor- 
mation matrix 



(2) 



which has been written using a matrix notation, where

We can take our power spectrum test function $W_i$ to be the triangle window

In this work we have used a discretisation $\Delta\ln k = 0.00875$ spanning a
range of scales that traverses the acoustic peaks from
$0.004 < k < 0.2\,{\rm Mpc}^{-1}$. It is worth noting at this stage that this
range need not include the largest scales responsible for the Sachs-Wolfe
effect: the Fisher information on these scales tends to zero, and so it proves
convenient to truncate these scales in order to invert the Fisher information
matrix later on without numerical difficulties. The calculation of the power
spectrum transfer functions $D_{\ell i}^{XX'}$ is achieved by making minor
modifications to the CAMB CMB anisotropies code (Lewis, Challinor & Lasenby
2000) (based on CMBFAST (Seljak & Zaldarriaga 1996)), rather than using the full
Boltzmann hierarchy code used in HO04. CAMB is run at slightly higher accuracy:
we have increased by a factor of four both the number of source and integration
$k$ modes, and have calculated $D_{\ell i}^{XX'}$ at every $\ell$, rather than
using the usual splining method with $\Delta\ell \approx 50$, in order to
capture the high frequency information.

Figure 1. Illustrating $F_{ij}$, given by equation (2), for the Planck
satellite, which displays a band-diagonal structure with peaks in sensitivity
corresponding to the temperature acoustic peaks. Here the discretisation is
$\Delta\ln k = 0.00875$. The bandwidth of the Fisher matrix,
$\delta\ln k \approx 0.05$, determines the maximum achievable resolution for the
recovery of the primordial power spectrum.
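The mode-counting Fisher matrix described in this section can be sketched numerically as follows. This is a minimal illustration, assuming the standard trace form $F_{ij} = \sum_\ell \tfrac{1}{2}(2\ell+1) f_{\rm sky}\,{\rm Tr}[\mathbf{D}_{\ell i}\mathbf{C}_\ell^{-1}\mathbf{D}_{\ell j}\mathbf{C}_\ell^{-1}]$; the function name `fisher_matrix`, the array layout and the `fsky` argument are hypothetical and not taken from the paper or from CAMB.

```python
import numpy as np

def fisher_matrix(D, C, ells, fsky=1.0):
    """Assumed standard form: F_ij = sum_l (2l+1)/2 * fsky * Tr[D_li C_l^-1 D_lj C_l^-1].

    D    : (n_params, n_ell, n, n) derivatives of the C_l matrix w.r.t. each p_i
    C    : (n_ell, n, n) total signal-plus-noise C_l matrices (n = 2 for T, E)
    ells : (n_ell,) multipoles
    """
    Cinv = np.linalg.inv(C)                          # inverse per multipole
    A = np.einsum('ilab,lbc->ilac', D, Cinv)         # A_li = D_li C_l^-1
    trace = np.einsum('ilab,jlba->ijl', A, A)        # Tr[A_li A_lj] per multipole
    weights = fsky * (2.0 * np.asarray(ells, dtype=float) + 1.0) / 2.0
    return trace @ weights                           # sum over multipoles
```

For a temperature-only analysis the matrices are $1\times1$ and the expression reduces to the familiar $\sum_\ell \tfrac{1}{2}(2\ell+1)\,D_{\ell i} D_{\ell j}/C_\ell^2$.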

The choice of fiducial cosmological parameters is given by a baryon density
$\Omega_B h^2 = 0.024$, cold dark matter density $\Omega_D h^2 = 0.121$, present
Hubble rate $H_0\,[{\rm km\,s^{-1}\,Mpc^{-1}}] = 72$, optical depth to last
scattering $\tau = 0.17$, and a curvature perturbation amplitude
$\mathcal{P}_0 = 23 \times 10^{-10}$. We assume a spatially flat cosmology and
ignore the effect of lensing. The latter will be important to take into account
in a more thorough analysis in order to avoid biasing the recovered cosmological
parameters (Hu & Okamoto 2004; Lewis 2005).

In Fig. 1 we illustrate the Fisher information matrix given by equation (2),
which shows a band-diagonal structure with peaks of sensitivity to the
primordial power spectrum on scales corresponding to the acoustic peaks; the
sensitivity drops again on scales corresponding to the acoustic troughs, which
can be remedied by information coming from the phase-shifted polarization peaks.
Of course the sensitivity tends to zero on large scales due to a lack of modes
to observe, and on small scales due to Silk damping and beam smoothing, since
the $C_\ell$ of equation (2) is replaced by the total signal plus a Gaussian
white noise model adjusted for a given experiment

where $\sigma^2_{\rm noise}$ is the noise variance in $(\mu{\rm K\,rad})^2$ and
$\theta$ is the FWHM of a Gaussian beam in radians. The noise model should be
considered an important input to the analysis since it determines the range of
scales that will be probed; it is an additional ingredient compared to the
majority of analyses of the $C_\ell$ data. We use here a noise model for Planck
with $\sigma^2_{\rm noise} = 3 \times 10^{-4}\,(\mu{\rm K\,rad})^2$ and
$\theta = 7'$, and a noise model for WMAP with
$\sigma^2_{\rm noise} = 8.4 \times 10^{-3}\,(\mu{\rm K\,rad})^2$ and
$\theta = 13'$. In a realistic analysis the observed signal plus noise spectrum
will be more appropriate.

Figure 2. Illustrating Planck's window of sensitivity to the primordial power
spectrum with and without polarisation (upper curve). Here
$\sigma_{\mathcal{P}}$ gives the approximate $1\sigma$ error on measurements of
the primordial power spectrum using bandpowers with
$\delta\ln k \approx 0.02 \rightarrow 0.05$. The vertical lines indicate the
position of the temperature acoustic peaks. The cosmological parameters have
been fixed, so some degrading of the sensitivity is expected.
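The white noise model with a Gaussian beam can be illustrated with a short sketch. It assumes the commonly used form $N_\ell = \sigma^2_{\rm noise}\exp[\ell(\ell+1)\theta^2/(8\ln 2)]$ with $\theta$ the FWHM in radians; this form and the function name `white_noise_cl` are assumptions made for illustration, while the Planck and WMAP numbers are those quoted in the text.

```python
import numpy as np

ARCMIN = np.pi / (180.0 * 60.0)  # one arcminute in radians

def white_noise_cl(ell, sigma2_noise, fwhm_arcmin):
    """Beam-deconvolved white noise spectrum in the units of sigma2_noise.

    Assumed form: N_l = sigma2_noise * exp(l (l + 1) theta^2 / (8 ln 2)).
    """
    ell = np.asarray(ell, dtype=float)
    theta = fwhm_arcmin * ARCMIN
    return sigma2_noise * np.exp(ell * (ell + 1.0) * theta**2 / (8.0 * np.log(2.0)))

ells = np.arange(2, 2001)
noise_planck = white_noise_cl(ells, 3e-4, 7.0)    # values quoted for Planck
noise_wmap = white_noise_cl(ells, 8.4e-3, 13.0)   # values quoted for WMAP
```

The exponential beam factor is what closes the window of sensitivity on small scales: the wider WMAP beam makes the noise term grow much faster with $\ell$ than for Planck.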

As usual the Fisher information matrix can be inverted to obtain a covariance
matrix $C_{ij}$ whose diagonal components provide a useful estimate, the
Cramér-Rao bound, of the expected variance of the parameters $p_i$ with

$$\sigma^2(p_i) = C_{ii} \approx (F^{-1})_{ii}. \qquad (7)$$

In Fig. 2 we plot this window of sensitivity to the primordial power spectrum
(on a scale $\delta\ln k \approx 0.05$ set by the Fisher matrix bandwidth) for
the Planck satellite, which can be seen to encompass the entire acoustic peak
region. As noted in HO04, outside this range of scales, and in particular on
large scales, we can only hope to recover wide-band ($\delta\ln k \gg 0.05$)
averages of the primordial power spectrum at high accuracy.

The PCA basis $\{m_a\}$ is simply a linear combination of the power spectrum
spike basis $\{p_i\}$

$$m_a = (\Delta\ln k)^{1/2} \sum_i S_{ia}\, p_i, \qquad (8)$$

where the $S_{ia}$ are the orthonormal eigenvectors of the covariance matrix. We
can then work with a set of normalised principal components
$S_a(k_i) = S_{ia}/\sqrt{\Delta\ln k}$ (hereafter the PCA modes) which will have
unit variance when integrated over $\ln k$. In Figs. 3 and 4 we plot examples of
the PCA modes with mode number from 1-4 and 17-20 respectively, generated using
the WMAP noise model. The oscillations in the PCA modes become increasingly
rapid at scales corresponding to the acoustic peaks, where sensitivity to the
primordial power spectrum is greatest, until we hit the numerical resolution. At
this point the PCA modes branch into two wavepacket-like solutions travelling
towards large and small scales, similar to the behaviour noted by Hamilton &
Tegmark (2000), although this need not worry us. Note also that the PCA modes
are invariant under changes in the discretisation scale $\Delta\ln k$. However,
we found that in order to obtain sensible estimates of the eigenvalues
(projected errors) of the PCA modes themselves, the Fisher matrix should be
discretised on a scale that renders it roughly diagonal, instead of
band-diagonal.

Figure 3. Illustrating PCA modes 1-4 which have been generated assuming the
WMAP noise model. The vertical lines indicate the position of the temperature
acoustic peaks.

Figure 4. Illustrating PCA modes 17-20, as in Fig. 3. The oscillations are
strongest in the vicinity of the acoustic peaks where the sensitivity to the
primordial power spectrum is greatest.
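The change of basis from spike amplitudes to PCA modes can be sketched as below. This is a minimal illustration, assuming that the spike parameters have covariance $F^{-1}$ and using the normalisation $S_a(k_i) = S_{ia}/\sqrt{\Delta\ln k}$ given in the text; the function names are hypothetical. Since $F$ and its inverse share eigenvectors, the sketch decomposes $F$ directly rather than inverting it first.

```python
import numpy as np

def pca_modes(F, dlnk):
    """Return normalised PCA modes S_a(k_i) and forecast variances of the m_a.

    F    : (n_k, n_k) Fisher matrix for the spike parameters p_i
    dlnk : discretisation step Delta ln k
    """
    evals, evecs = np.linalg.eigh(F)           # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]            # best-constrained mode first
    evals, S = evals[order], evecs[:, order]   # S[i, a] = S_ia
    modes = S / np.sqrt(dlnk)                  # S_a(k_i) = S_ia / sqrt(dlnk)
    var_m = dlnk / evals                       # variance of m_a in this normalisation
    return modes, var_m

def amplitudes(p, modes, dlnk):
    """m_a = (dlnk)^(1/2) sum_i S_ia p_i, expressed through the normalised modes."""
    return dlnk * modes.T @ p
```

With this normalisation the modes satisfy $\sum_i \Delta\ln k\, S_a(k_i) S_b(k_i) = \delta_{ab}$, and the spike amplitudes are recovered as `modes @ m`.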

The PCA modes can be straightforwardly integrated into the publicly available
Markov Chain Monte Carlo (MCMC) package CosmoMC (Lewis & Bridle 2002, February
2005 version) in order to explore the full cosmological plus power spectrum
posterior parameter space. Specifically, we use the following power spectrum
ansatz

$$\ldots + \sum_a m_a S_a(k), \qquad (9)$$

where we take $\mathcal{P}_0 = 23 \times 10^{-10}$, which should be calibrated
from observations. Clearly if the underlying primordial power spectrum is close
to scale-invariant then equation (9) admits a solution

$$m_a = 0, \quad \forall a \iff \text{Scale-invariance}. \qquad (10)$$

More generally equation (9) is strongly suggestive of a general linear
orthonormal model plus a noise term (see for instance Bretthorst 1988). In this
way we are attempting to measure the spectrum of departures from
scale-invariance, which we call $\Delta\mathcal{P}/\mathcal{P}_0$ and which is
given by the second term in equation (9); in this context the dominant
scale-invariant component $m_0$ is a Gaussian noise term.

Concerning the numerical implementation of the power spectrum equation (9), we
simply perform a linear spline in $\ln k$ over the discrete PCA modes
$S_a(k_i)$, which are added together before the final convolution with the CMB
transfer functions to obtain the $C_\ell$; outside the PCA mode $k$-range the
second term of equation (9) is dropped. We checked that the default $k$-source
and $k$-integration settings for CAMB, modified to calculate $C_\ell$ at
$\Delta\ell = 3$, are accurate enough to handle around the first forty modes of
our current implementation; at this stage this is more than enough since we will
only attempt to perform the MCMC with a maximum of sixteen PCA modes.
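The interpolation step described above can be sketched in a few lines. The sketch assumes that the leading scale-invariant term of the ansatz is unity, so that $\mathcal{P}(k)/\mathcal{P}_0 = 1 + \sum_a m_a S_a(k)$; that assumption, the function name and the argument layout are illustrative and not taken from the CosmoMC or CAMB source.

```python
import numpy as np

def power_spectrum_ratio(k, lnk_nodes, modes, m):
    """Evaluate P(k)/P0 = 1 + sum_a m_a S_a(k) by linear interpolation in ln k.

    k         : wavenumbers at which to evaluate the spectrum
    lnk_nodes : (n_k,) increasing ln k grid on which the modes are tabulated
    modes     : (n_k, n_modes) array with modes[i, a] = S_a(k_i)
    m         : (n_modes,) PCA mode amplitudes
    """
    lnk = np.log(np.atleast_1d(np.asarray(k, dtype=float)))
    delta = np.zeros_like(lnk)
    for m_a, S_a in zip(m, modes.T):
        # left/right = 0 drops the PCA term outside the tabulated k-range
        delta += m_a * np.interp(lnk, lnk_nodes, S_a, left=0.0, right=0.0)
    return 1.0 + delta
```

Setting every amplitude to zero returns the scale-invariant spectrum at all $k$, which is the solution singled out in equation (10).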

Having obtained measurements of the PCA mode amplitudes from the MCMC, it is
then straightforward to project any power spectrum model, for instance a
power-law spectrum, onto the PCA modes via

$$m_a = \int d\ln k \; S_a(k)\, \frac{\Delta\mathcal{P}}{\mathcal{P}_0}(k), \qquad (11)$$

in order to make the comparison with observations.
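The projection of a model spectrum onto the PCA modes can be sketched with a simple trapezoid rule on the $\ln k$ grid. The power-law helper assumes the model is normalised to $\mathcal{P}_0$ at the pivot scale, so that $\Delta\mathcal{P}/\mathcal{P}_0 = (k/k_0)^{n_S-1} - 1$; this normalisation and both function names are assumptions made for illustration.

```python
import numpy as np

def project_onto_modes(lnk, modes, delta_p):
    """m_a = int dlnk S_a(k) [DeltaP/P0](k), using the trapezoid rule.

    lnk     : (n_k,) increasing ln k grid
    modes   : (n_k, n_modes) array with modes[i, a] = S_a(k_i)
    delta_p : (n_k,) model departure from scale-invariance, DeltaP/P0
    """
    integrand = modes * delta_p[:, None]
    steps = np.diff(lnk)[:, None]
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * steps, axis=0)

def power_law_departure(lnk, n_s, k_pivot=0.05):
    """DeltaP/P0 for a power law equal to P0 at k_pivot (in Mpc^-1)."""
    return np.exp((n_s - 1.0) * (lnk - np.log(k_pivot))) - 1.0
```

A scale-invariant model projects to zero amplitude in every mode, while a tilted model produces a characteristic pattern of amplitudes that can be compared directly with the MCMC measurements.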

We can easily make an empirical estimate of the total signal to noise of the
measured departures from scale-invariance

where $\langle m_a\rangle$ and $\sigma^2_{m_a}$ are the mean and variance of the
individual mode amplitudes obtained from the MCMC. As noted by Kadota et al
(2005), the PCA modes can be safely truncated as soon as S/N saturates, assuming
that the underlying primordial power spectrum is a reasonably smooth function.
Incidentally, the total S/N represents a useful figure of merit for optimising
future CMB experiments to measure the primordial power spectrum sector. Other
measures such as "risk" (Huterer & Starkman 2003) and Bayesian evidence (see for
example MacKay 2003) could be used to provide a rationale for truncating the PCA
mode amplitudes even further, given a power spectrum model of interest.
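The saturation of the signal to noise with mode number can be monitored with a short sketch. It assumes the usual quadrature sum $(S/N)^2 = \sum_a \langle m_a\rangle^2/\sigma^2_{m_a}$ and an unweighted array of MCMC samples; both are assumptions made here for illustration, since weighted chains would need weighted means and variances.

```python
import numpy as np

def cumulative_signal_to_noise(samples):
    """Cumulative S/N after including modes 1..n.

    samples : (n_samples, n_modes) unweighted MCMC samples of the mode amplitudes
    Assumed form: (S/N)^2 = sum_a <m_a>^2 / sigma^2_{m_a}.
    """
    mean = samples.mean(axis=0)
    var = samples.var(axis=0, ddof=1)
    return np.sqrt(np.cumsum(mean**2 / var))
```

Plotting the returned array against mode number shows where additional modes stop contributing, which is the truncation criterion described above.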

In the case that the recovered PCA mode amplitudes encode some complex
information which cannot be easily understood in the framework of power-law
spectra, it would be useful to obtain an estimate of
$\Delta\mathcal{P}/\mathcal{P}_0$ in $k$-space in order to aid the process of
modelling the power spectrum. Here we use an estimator

$$\frac{\Delta\mathcal{P}(k_i)}{\mathcal{P}_0} = \sum_a \langle m_a\rangle\, S_a(k_i), \qquad (13)$$

and for the purposes of a comparison with the input spectrum, we estimate the
noise variance via

where $C_{ii}$ is the covariance matrix, obtained from equation (7), accounting
for the overall uncertainty in the narrow-band determination of
$\Delta\mathcal{P}/\mathcal{P}_0$ in regions of lower sensitivity on large
scales, small scales, and in the temperature acoustic trough regions.

Figure 5. Illustrating the recovery of the first eight mode amplitudes from
simulated Planck data with an input scale-invariant spectrum. Plotted are the
marginalised $1\sigma$ error bars obtained from MCMC. The models (dashed lines)
are for power-law spectra with
$n_S(k = 0.05\,{\rm Mpc}^{-1}) = \{0.99, 1, 1.01\}$ (top to bottom, mode 1).

Figure 6. Illustrating the estimated departures from scale-invariance in
$k$-space on a narrow-band scale $\delta\ln k \approx 0.02$ for the case of an
input scale-invariant spectrum. The solid curves show the estimated $1\sigma$
error bars, given by equation (14). A scale-invariant spectrum within the
acoustic peak region is strongly favoured.

A bandpower representation of the primordial power spectrum could also be
obtained from the measured PCA mode amplitudes via a Monte Carlo procedure; in
this case the Fisher information matrix could be used for guidance when choosing
the location and widths of the bands. Obviously though, no further quantitative
information about the primordial power spectrum can be gleaned in this way.

One final point worth making in this section concerns how one should deal with
the inevitable degeneracies between the effect on the $C_\ell$ due to the
cosmological parameters and the PCA power spectrum parameters, which will induce
undesired off-diagonal components in the PCA covariance matrix. We sketch here
the solution given in HO04: One must first form the joint Fisher information
matrix, $F_{\mu\nu}$, for both power spectrum parameters and cosmological
parameters

$$F_{\mu\nu} = \begin{pmatrix} F_{ij} & B^T \\ B & F_{ab} \end{pmatrix}, \qquad (15)$$

where $F_{ab}$ is the usual cosmological parameter Fisher information matrix
(see for example Tegmark, Taylor & Heavens 1997) and $B$ are the cross terms.
After inverting the full $F_{\mu\nu}$ to obtain a new covariance matrix
$C_{\mu\nu}$, one simply retains the power spectrum parameter subblock $C_{ij}$,
whose principal components will be "orthogonalized" to the effect of the
cosmological parameters. In terms of implementation, one can use the matrix
partitioning formulas (see for example Press et al 1992; Ó Ruanaidh & Fitzgerald
1996) to derive a "degraded" $F_{ij}$ subblock

$$F_{ij}^{\rm deg} = F_{ij} - B^T F_{ab}^{-1} B. \qquad (16)$$

We will make use of this in the next section.
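The marginalisation over cosmological parameters described in this section can be sketched numerically using the standard matrix partitioning (Schur complement) identity. The block layout, with `B` of shape (cosmological parameters, power spectrum parameters), and the function name are assumptions made for illustration.

```python
import numpy as np

def degraded_fisher(F_pp, F_cc, B):
    """Power spectrum Fisher subblock after marginalising cosmological parameters.

    F_pp : (n_p, n_p) power spectrum block
    F_cc : (n_c, n_c) cosmological parameter block
    B    : (n_c, n_p) cross terms
    Returns the Schur complement F_pp - B^T F_cc^{-1} B.
    """
    # solve() applies F_cc^{-1} to B without forming the inverse explicitly
    return F_pp - B.T @ np.linalg.solve(F_cc, B)
```

Inverting the returned matrix gives the same result as inverting the full joint Fisher matrix and keeping only its power spectrum subblock, but only the small cosmological block has to be solved against.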



3 TESTS WITH SIMULATED PLANCK DATA 

As a means of gaining experience with the PCA method we generate simulated
Planck data up to an $\ell_{\max} = 2250$ using the Gaussian white noise model
of equation (6) for a cosmological model with parameters
$\Omega_B h^2 = 0.024$, $\Omega_D h^2 = 0.121$, $H_0 = 72$, $\tau = 0.17$, and
$\mathcal{P}_0 = 2.3 \times 10^{-9}$, which for simplicity are the same as those
used to generate the PCA modes themselves. In a realistic data analysis
scenario, the PCA modes would be generated with parameters close to the best-fit
obtained from a traditional parameter determination approach. We consider three
cases for the primordial power spectrum, which is taken to be described by a
scale-invariant spectrum, a power-law spectrum with spectral index $n_S = 0.97$
and pivot scale $k_0 = 0.05\,{\rm Mpc}^{-1}$, and then finally a broken
scale-invariance model with a Gaussian bump in the acoustic peak region

$$\frac{\mathcal{P}(k)}{\mathcal{P}_0} = 1 + 0.1 \exp\left[-\left(\frac{\ln[k/0.08\,{\rm Mpc}^{-1}]}{0.3}\right)^2\right]. \qquad (17)$$

We then perform MCMC over the full cosmological plus PCA mode parameter space
using the simulated data up to an $\ell_{\max} = 2000$. We have also varied the
number of modes included in the analysis from zero to sixteen in steps of four
in order to study the effect of truncating the PCA expansion on the recovery of
the cosmological parameters.

The development of CosmoMC (Lewis & Bridle 2002) has reached a maturity that is
very well suited to an analysis of this type, where the number of power spectrum
parameters begins to dominate over the number of cosmological parameters, but
where we nonetheless expect by construction to obtain a stable multivariate
Gaussian posterior solution. As a result we have taken full advantage of a
conjugate gradient descent module which estimates the covariance and location of
the posterior peak before the MCMC begins, thus alleviating the potential
challenge of working with so many parameters while also conserving some
computing resources. On this note, the total number of $C_\ell$ likelihood
evaluations required in our tests in the following section rises from around
$2 \times 10^4$ to $10^6$ for zero and eight PCA modes respectively, and then
tends to saturate at around this number. It seems reasonable that the number of
likelihood evaluations ought not to exceed by much $\ell_{\max}^2$, the total
number of modes upon which the $C_\ell$ spectrum depends. Moreover the
'fast-slow' split between power spectrum and cosmological parameter likelihood
evaluation speeds, already implemented in CosmoMC, will be of increasing benefit
as we attempt to measure up to perhaps thirty PCA mode amplitudes in the future
(Kadota et al 2005).

Figure 7. Illustrating the recovery of the first ten principal component
amplitudes from simulated Planck data with an input $n_S = 0.97$ spectrum, as in
Fig. 5. The models (dashed lines) correspond to power-law spectra with
$n_S(k_0 = 0.05\,{\rm Mpc}^{-1}) = \{0.97, 0.975\}$ (bottom to top, mode 3). The
compressed CMB data cannot be fit by $m_a = 0$ and so scale-invariance would be
ruled out at high signal to noise.



3.1 The scale-invariant case 

In Fig. 5 we illustrate the recovery of the first eight mode amplitudes for the
$n_S = 1$ case and make a comparison with the theoretical prediction for the
mode amplitudes, which are obtained by projecting some representative power-law
spectra onto the PCA modes via equation (11); we find that the scale-invariant
solution $m_a = 0$ is very well recovered. Here it is worth mentioning that the
Gaussian realisation for the simulated Planck data sets was taken to be the
exact $C_\ell$ model, which explains why the recovery of the PCA mode amplitudes
shows very little scatter around $m_a = 0$. One can see that the first three PCA
modes provide the bulk of the constraining power for smooth power-law spectra,
leading to a constraint which will be roughly $n_S = 1 \pm 0.01$, consistent
with typical parameter forecasts in the literature.

We illustrate an estimate of the departures from scale-invariance
$\Delta\mathcal{P}/\mathcal{P}_0$ in Fig. 6, and the region with the most data
weight can clearly be discerned, showing consistency with a scale-invariant
spectrum. In this case the recovery of the cosmological parameters is also
excellent, and we recovered a stable Gaussian posterior (as a function of the
number of PCA modes) with constraints given by
$\Omega_B h^2 = 0.0240 \pm 0.0002$, $\Omega_D h^2 = 0.121 \pm 0.02$,
$H_0 = 71.9 \pm 0.7$, $\tau = 0.170 \pm 0.005$,
$\mathcal{P}/\mathcal{P}_0 = 1.00 \pm 0.01$ for the case of using eight PCA
modes. Clearly the PCA method works well under these most idealised of
circumstances.

Figure 8. Illustrating the estimated departures from scale-invariance in
$k$-space for the case of an input $n_S = 0.97$ spectrum (inclined dashed line),
as in Fig. 6. A tilt is recovered in the region $k = 0.06 - 0.1\,{\rm Mpc}^{-1}$
with enough signal to noise to overrule the assumption of scale-invariance in
model equation (9).



3.2 The scale-free ns = 0.97 case 

The $n_S = 0.97$ case is more delicate since we know in advance that the power
spectrum model equation (9) will not be able to accurately describe a tilted
spectrum on large or small scales. We can therefore expect some biasing in the
recovery of the cosmological parameters, which will necessarily adjust to
provide the overall excess of power on large scales relative to small scales;
this is just the usual degeneracy between cosmological and power spectrum
parameters.

In fact to get reasonable results at all, we found it necessary to apply
equation (16) in order to orthogonalise the PCA modes to the effect of the
primordial power spectrum amplitude $\mathcal{P}_0$. The qualitative effect on
the PCA modes is that the positive definite mode 1 is removed. Having modified
the PCA modes in this way, the cosmological parameters are recovered as
$\Omega_B h^2 = 0.0247 \pm 0.0002$, $\Omega_D h^2 = 0.116 \pm 0.001$,
$H_0 = 74.6 \pm 0.7$, $\tau = 0.183 \pm 0.006$,
$\mathcal{P}/\mathcal{P}_0 = 1.02 \pm 0.01$ for the case of using eight PCA
modes, showing biases at the $3\sigma$ to $4\sigma$ level. The fact that the
recovered dark matter density shifts from
$\Omega_D h^2 = 0.113 \pm 0.001 \rightarrow 0.116 \pm 0.001$ as the number of
PCA modes is increased provides a useful indication that there are problems
afoot with our power spectrum model equation (9).

Interestingly however, the PCA mode amplitudes are still very well recovered,
and we illustrate in Fig. 7 that the first ten mode amplitudes, if somewhat
attenuated in amplitude, provide strong evidence for a power-law primordial
power spectrum, showing a distinctive pattern deviating from scale-invariance,
$m_a = 0$. The corresponding departures from scale-invariance are shown in
Fig. 8, where the recovered power spectrum shows strong evidence for a tilt,
modulo some attenuation and oscillations in regions of lower sensitivity. In
short there is enough signal to noise to overrule our assumption of
scale-invariance, supplying us with strong evidence that the model of equation
(9) needs refining. It is likely that in a more refined analysis, one should
orthogonalise the PCA modes to the effect of the spectral index and the other
cosmological parameters in order to recover unbiased estimates of the
cosmological parameters.

Figure 9. Illustrating the recovery of the first sixteen principal component
amplitudes from simulated Planck data with an input Gaussian bump primordial
power spectrum, as in Fig. 5. The models (dashed lines) correspond to the
Gaussian bump of equation (17), centred on
$k = \{0.082, 0.08, 0.078\}\,{\rm Mpc}^{-1}$ (top to bottom, mode 2).
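The size of the parameter shifts in this section can be checked directly from the quoted numbers. The sketch below treats each quoted error as a Gaussian standard deviation and divides the offset from the input value by it; the dictionary keys and function name are illustrative labels only.

```python
# Input (fiducial) values and the recovered (mean, sigma) pairs quoted in the
# text for the n_S = 0.97 test with eight PCA modes.
fiducial = {"Omega_B h^2": 0.024, "Omega_D h^2": 0.121, "H_0": 72.0,
            "tau": 0.17, "P/P_0": 1.0}
recovered = {"Omega_B h^2": (0.0247, 0.0002), "Omega_D h^2": (0.116, 0.001),
             "H_0": (74.6, 0.7), "tau": (0.183, 0.006), "P/P_0": (1.02, 0.01)}

def bias_in_sigma(true_value, mean, sigma):
    """Offset of the recovered mean from the input value, in units of sigma."""
    return (mean - true_value) / sigma

biases = {name: bias_in_sigma(fiducial[name], *recovered[name]) for name in fiducial}

for name, value in biases.items():
    print(f"{name:12s} {value:+.1f} sigma")
```

Repeating the same calculation as the number of PCA modes is varied gives a quick diagnostic of whether the recovered cosmological parameters are stable against truncation of the expansion.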



3.3 The Gaussian bump case 

Although completely contrived, this is perhaps the most interesting and
challenging case since the input primordial power spectrum now contains a
distinct feature within the acoustic peak region. We illustrate in Fig. 9 that
the first sixteen PCA amplitudes are nonetheless rather well recovered and are
consistent with the input Gaussian bump model. In this case we can see that, for
instance, the second PCA mode strongly constrains the central position of the
feature in $k$-space. In Fig. 10 we show that a bump-like feature has indeed
been recovered, again modulo some attenuation and oscillations in regions of
lower sensitivity. The cosmological parameters are also very well recovered with
$\Omega_B h^2 = 0.0238 \pm 0.0002$, $\Omega_D h^2 = 0.122 \pm 0.002$,
$H_0 = 71.6 \pm 0.9$, $\tau = 0.170 \pm 0.005$,
$\mathcal{P}/\mathcal{P}_0 = 1.00 \pm 0.01$. This represents an interesting
success for the PCA method.



3.4 Summary and discussion 

Figure 10. Illustrating the estimated departures from
scale-invariance in $k$-space for the case of an input
scale-invariant plus Gaussian bump spectrum (dashed curve), as
in Fig. A distinct bump-like feature is recovered in the
acoustic peak region. Precision polarization data would be
required in order to better recover the feature in between
the third, fourth, and fifth temperature acoustic peak scales
(vertical dotted lines).

To summarise the tests so far, the PCA method has been
demonstrated here to be suitable and effective for measuring
departures from scale-invariance, both scale-free and
scale-dependent, in the most data-weighted regions of the
$C_\ell$ spectrum. In a realistic data analysis setup the
recovered PCA mode amplitudes, together with the PCA modes
themselves, will represent an extremely powerful compression
of our information concerning the primordial power spectrum.
At first sight this may seem an unnecessary data analysis
stage compared with the usual parameter determination methods,
where one fits to the $C_\ell$ data directly using the power
spectrum model parameters on the same footing as the other
cosmological parameters. However, the point here is first to
obtain a detailed picture of the most important departures
from scale-invariance in the primordial power spectrum, while
at the same time being able to weigh up the relative
importance of, and to locate in both $k$ and $\ell$ space, any
possible glitches or residual systematic effects in the
$C_\ell$ data. Then, in the final data compression stage, we
can use the PCA mode amplitudes to rapidly test a wide class
of specific power spectrum models with great ease and without
recourse to any further $C_\ell$ likelihood evaluations, as
was recently emphasised by Kadota et al (2005) for the case of
inflation models.
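
The final compression stage can be illustrated with a minimal sketch. It assumes that the measured mode amplitudes are summarised by a mean vector and a covariance matrix with Gaussian errors; the function names `signal_to_noise` and `chi2_model`, and all numbers, are hypothetical and chosen purely for illustration.

```python
import numpy as np

def signal_to_noise(m_hat, cov):
    """Per-mode and total S/N of the amplitudes against the null m_a = 0."""
    m_hat = np.asarray(m_hat, dtype=float)
    cov = np.asarray(cov, dtype=float)
    per_mode = np.abs(m_hat) / np.sqrt(np.diag(cov))
    total = np.sqrt(m_hat @ np.linalg.solve(cov, m_hat))
    return per_mode, total

def chi2_model(m_hat, cov, m_model):
    """Gaussian chi^2 of a model's predicted amplitudes.

    Only the compressed amplitudes enter; no C_ell evaluation is needed.
    """
    r = np.asarray(m_hat, dtype=float) - np.asarray(m_model, dtype=float)
    return float(r @ np.linalg.solve(cov, r))

# Illustrative numbers only (assumed, not measurements from any experiment).
m_hat = np.array([0.00, 0.03, -0.04])
cov = np.diag([0.01, 0.01, 0.01]) ** 2
per_mode, total = signal_to_noise(m_hat, cov)
```

Any candidate power spectrum model then only needs to supply its predicted amplitudes `m_model` in order to be compared with the compressed data.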



4 APPLICATION TO THE WMAP DATA 

In this Section we apply the PCA method to the currently
available temperature and temperature-polarization
cross-correlation spectra from WMAP (Kogut et al 2003; Verde
et al 2003; Hinshaw et al 2003) and bandpowers in the range
$600 < \ell < 2000$ from the VSA (Grainge et al 2003; Dickinson
et al 2004), ACBAR (Kuo et al 2004), CBI (Pearson et al 2003;
Readhead et al 2004) and Boomerang B2K (Jones et al 2005;
Piacentini et al 2005; Montroy et al 2005) instruments.

To emphasise once more, we are working within the framework of
spatially flat $\Lambda$CDM cosmologies, described by five
basic cosmological parameters: the baryon density
$\Omega_B h^2$, the cold dark matter density $\Omega_D h^2$,
the optical depth to last scattering $\tau$, the ratio of the
sound horizon to angular diameter distance at last scattering
$\vartheta = 100\,r_*/D_*$ (instead of $H_0$) and the overall
amplitude of scalar perturbations $\mathcal{P}_0$. In addition
we throw into the mix the first four PCA modes, generated with
a noise model for WMAP given by
$\sigma_{\rm noise} = 8.4 \times 10^{-3}\ (\mu{\rm K\,rad})^2$
and $\theta = 13'$.
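
The exact form of the noise model is not restated here. As a rough sketch only, the following assumes the widely used Gaussian-beam form $N_\ell = w^{-1} \exp[\ell(\ell+1)\theta^2/(8\ln 2)]$, with the quoted $8.4 \times 10^{-3}\ (\mu{\rm K\,rad})^2$ interpreted as the inverse weight $w^{-1}$ and the quoted $13'$ as the beam FWHM. Both interpretations, and the function name `noise_power`, are assumptions.

```python
import numpy as np

ARCMIN = np.pi / (180.0 * 60.0)  # one arcminute in radians

def noise_power(ell, w_inv=8.4e-3, fwhm_arcmin=13.0):
    """Assumed noise spectrum N_ell = w^-1 exp[ell(ell+1) theta^2 / (8 ln 2)].

    w_inv is in (muK rad)^2 and the beam FWHM in arcminutes; the result
    is in the same units as w_inv.
    """
    theta = fwhm_arcmin * ARCMIN
    ell = np.asarray(ell, dtype=float)
    return w_inv * np.exp(ell * (ell + 1.0) * theta**2 / (8.0 * np.log(2.0)))

ells = np.arange(2, 1001)
n_ell = noise_power(ells)
```

Under these assumptions the noise grows exponentially once $\ell$ exceeds the inverse beam scale, which is what limits the range of wavenumbers over which the modes are well constrained.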

Figure 11. Illustrating the current PCA measurements using
current data from WMAP, VSA, ACBAR, CBI and Boomerang. The
compressed CMB data are well fit by $m_a = 0$ and so show no
evidence for breaking of scale-invariance. The dashed lines
show power-law models with
$n_s(k_0 = 0.04\ {\rm Mpc}^{-1}) = \{1.0, 0.94\}$ (top to
bottom).

Figure 12. Illustrating the estimated departures from
scale-invariance using current data from WMAP, VSA, ACBAR, CBI
and Boomerang. The spectrum is scale-invariant, showing only
the slightest hint of a tilt. The best-fit spectrum with
$n_s(k_0 = 0.04\ {\rm Mpc}^{-1}) = 0.94 \pm 0.04$ is shown
(dashed inclined line) as well as the first, second and third
acoustic peak scales (vertical dotted lines).

The measured amplitudes of the first four modes of Fig. 2 are
displayed in Fig. 11, with the corresponding power spectrum in
Fig. 12. The broad picture painted here is that we find no
evidence for the breaking of scale-invariance: the mode
amplitudes are very well fit by $m_a = 0$. Only a single mode,
on scales corresponding to the second acoustic peak, shows an
$S/N > 1$, which is barely worth mentioning aside from the fact
that it can easily be accommodated by a slightly red primordial
power spectrum: projecting power-law primordial power spectra
onto the PCA basis and using a simple Gaussian likelihood
function, we find the constraint on the spectral index to be
$n_s(k_0 = 0.04\ {\rm Mpc}^{-1}) = 0.94 \pm 0.04$, displayed in
Fig. 13, which is in accordance with conventional studies of
the primordial power spectrum. It is also possible to make a
detailed comparison with the primordial power spectrum
bandpowers from fig. 4 of Bridle et al (2003), as well as with
orthogonal wavelet expansion constraints in fig. 2 of Mukherjee
& Wang (2003b). All three analyses find the same very weak
trend for a 20-30% drop in power between the first acoustic
peak at $k = 0.02\ {\rm Mpc}^{-1}$ and the third acoustic peak
scale at $k = 0.07\ {\rm Mpc}^{-1}$. Again, at this stage the
trend itself is less interesting than the consistency between
these complementary methods.
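
The projection of power-law spectra onto the mode basis, followed by a Gaussian likelihood for the spectral index, can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used here: the fractional departure from scale-invariance is taken to be $(k/k_0)^{n_s-1} - 1$, the modes are assumed orthonormal in $\ln k$, and `toy_modes` is a hypothetical polynomial stand-in for the actual PCA modes.

```python
import numpy as np

def toy_modes(lnk, n_modes):
    """Hypothetical stand-in for the PCA modes: polynomials in ln k,
    orthonormalised on the grid so that sum_k W_a W_b dlnk = delta_ab."""
    dlnk = lnk[1] - lnk[0]
    x = np.linspace(-1.0, 1.0, lnk.size)
    raw = np.vander(x, n_modes + 1, increasing=True)[:, 1:]  # drop constant
    q, _ = np.linalg.qr(raw)
    return q.T / np.sqrt(dlnk)

def project_power_law(ns, lnk, modes, k0=0.04):
    """Amplitudes m_a = sum_k W_a(k) [(k/k0)^(ns-1) - 1] dlnk."""
    dlnk = lnk[1] - lnk[0]
    delta = np.exp((ns - 1.0) * (lnk - np.log(k0))) - 1.0
    return modes @ delta * dlnk

def ns_posterior(m_hat, cov, lnk, modes, ns_grid):
    """Flat-prior posterior on a uniform n_s grid from a Gaussian likelihood."""
    cinv = np.linalg.inv(cov)
    chi2 = np.empty(ns_grid.size)
    for i, ns in enumerate(ns_grid):
        r = m_hat - project_power_law(ns, lnk, modes)
        chi2[i] = r @ cinv @ r
    post = np.exp(-0.5 * (chi2 - chi2.min()))
    return post / (post.sum() * (ns_grid[1] - ns_grid[0]))

# Grid over the wavenumber range where the modes are localised (assumed).
lnk = np.linspace(np.log(0.01), np.log(0.2), 200)
modes = toy_modes(lnk, 4)
```

With measured amplitudes and their covariance in hand, `ns_posterior` requires only matrix products per grid point, so scanning the spectral index involves no further Boltzmann code or likelihood calls.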



5 CONCLUSIONS 

In this work we have implemented and investigated a principal
component analysis (PCA) technique in order to study the
possible departures from scale-invariance that may exist in the
spectrum of primordial curvature perturbations, which are
observable via the CMB anisotropies. The essence of this method
is to decompose the primordial power spectrum into a
scale-invariant component plus a series of orthonormal modes
which reflect our expectation of where the departures from
scale-invariance are likely to be best probed by the data. The
information from the CMB is then compressed into a series of
mode amplitudes which can easily be compared with predictions
from a wide class of primordial power spectra without recourse
to any further $C_\ell$ likelihood evaluations.

Figure 13. Illustrating the posterior constraint on the
spectral index
$n_s(k_0 = 0.04\ {\rm Mpc}^{-1}) = 0.94 \pm 0.04$ obtained from
the four PCA mode amplitudes displayed in Fig. 11.

The method was first tested on simulated Planck data using an
input scale-invariant spectrum, and we observed good
performance in the simultaneous recovery of cosmological
parameters and the principal component mode amplitudes via an
MCMC exploration of the full parameter space. In the case of
simulated data from an input power-law spectrum with spectral
index $n_s = 0.97$, the recovery of the cosmological parameters
was biased as they adjusted to provide an overall excess of
large-scale relative to small-scale power. However, the biasing
is evidenced by fluctuating cosmological parameter constraints
as the number of power spectrum principal components is
increased. Moreover, the PCA mode amplitudes were still very
well recovered, showing strong evidence for a tilted primordial
power spectrum and providing enough signal to noise to overrule
our assumption of scale-invariance. Thus PCA can be used as a
self-consistent means for justifying a more refined power
spectrum model than the one considered here in equation. We
also demonstrated that the PCA method is capable of measuring
departures from scale-free spectra by considering simulated
data from a primordial power spectrum containing a 10% Gaussian
bump in the acoustic peak region, and observing good recovery
of both the PCA mode amplitudes and the cosmological
parameters.

Finally, as a proof of concept of the method we provided a
first glimpse of the principal component mode amplitudes that
can be obtained from the currently available CMB data from
WMAP, VSA, ACBAR, CBI and Boomerang. We obtained measurements
of the first four principal components, corresponding to scales
across the first and second acoustic peaks, finding no evidence
for the breaking of scale-invariance and only a hint of a red
primordial power spectrum with spectral index
$n_s(k_0 = 0.04\ {\rm Mpc}^{-1}) = 0.94 \pm 0.04$, consistent
with other studies in the literature, with a total signal to
noise of not more than $S/N \sim 2.5$.

Assuming that the Gaussian adiabatic density perturbation
scenario continues to hold as our observations of the CMB
improve in the near future, we will soon move into the regime
where the information about the primordial power spectrum will
completely outweigh the information about the cosmological
parameters, which become, from this perspective,
well-understood nuisance parameters to be carefully integrated
out. It seems very likely that principal component analysis, or
else another very similar data compression technique, will be
essential for fully exploiting the forthcoming temperature and
polarization $C_\ell$ data.